feat(controller): permit setting GUNICORN_WORKERS #75
Conversation
did I tag this correctly? it's a charts-only change...
... it's not charts-only anymore
This looks good. Have you tested it by spinning up a controller with your changes connected to the database? It looks good, but we should test it just in case the charts broke. You should be able to package the controller charts that have changed. Define a values.yml and pass in the
I will give it a shot and report back. Thanks for the guidance!
Hmm, that might have worked in a vacuum, but I tried upgrading my existing deployment and ran into a number of issues. For one thing, slugrunner-config appears to be a dependency of controller (I did not know that). I also noticed that doing a deploy this way without going through deisrel, directly from a branch of the git repo, still reaches out to
I have a feeling that the patch is fine, but I have a homework assignment now and it will take some time to verify it. Meanwhile I noticed this one thing missing that should really be added... ade2319
@kingdonb The best approach would be to deploy Hephy Workflow 2.19.4 on minikube and then run the helm commands above. You will also need to build a new Docker image, push it somewhere, and update the charts to pull that custom image that contains the change. Also, the build was failing and I restarted it. Let's see...
Hey @kingdonb, do you think you will have some time to take a look at this again?
Yes, I would like to... but I don't know if I will be able to get to it before 11 days from now, when the Open Roadmap is scheduled. I don't know if someone else can test it? I'm setting a reminder to check on this in 4 days; hopefully by then I'll be able to spend the time and test it. I just can't right now.
I can also try to test the PR at some point if I can get to it, but I will leave it to you for now, as there are some other things that are more pressing.
I'm going to get around to this, let's say by Wednesday. If I don't, then you can rib me about it Thursday at the Open Roadmap meeting. I also said I would look at teamhephy/workflow-e2e#18
Whenever you have some free time to devote to this, @kingdonb. I know you have been busy. 🐝 EDIT: I am assuming this mostly just has to be tested, and as long as it doesn't break any existing functionality it should be good to merge in.
This time it passed the tests. I still haven't manually tried all four of the examples I wrote, just to be sure... maybe tomorrow.
# Setting the value to "None" will ensure that connections are never timed out.
# - 0
# - 600
# - "None"
What is the default value of conn_max_age when no value is set in the chart? Sounds like it will be "None" by default... Is that the correct behaviour we want?
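For what it's worth, Django's documented default for CONN_MAX_AGE is 0 (the connection is closed at the end of each request), not None. A minimal sketch of how the chart's string value might be translated into the Django setting; the helper name and the env var name are assumptions for illustration, not the controller's actual code:

```python
import os

def parse_conn_max_age(raw):
    """Translate a chart-supplied string into Django's CONN_MAX_AGE.

    Django's default is 0 (close after each request); the string
    "None" maps to Python's None, meaning connections never time out.
    """
    if raw is None or raw.strip() == "":
        return 0  # unset in the chart: fall back to Django's default
    if raw.strip().lower() == "none":
        return None  # persistent connections, never recycled
    return int(raw)  # otherwise: lifetime in seconds

# the env var name here is hypothetical
CONN_MAX_AGE = parse_conn_max_age(os.environ.get("DEIS_DATABASE_CONN_MAX_AGE"))
```

With a mapping like this, leaving the chart value unset would preserve Django's default behaviour rather than silently switching to unlimited-lifetime connections.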
OK, let me know when it has been tested sufficiently and can be merged.
Judging from Docker's original intent, running multiple processes in a single container is not recommended. So I suggest scaling containers instead of using multiple processes.
I think in the original issue, the problem was that database connections were not being closed when the number of running connections exceeded the maximum value of GUNICORN_WORKERS. The controller is the main client of the database, and we should be able to fine-tune how many worker processes it creates on systems with lots of CPUs:
Yes, generally scaling containers is better than scaling processes with container technologies. But here we are not totally in charge of how Django uses Gunicorn, so we are dealing with the Gunicorn server architecture model. I think we are correct in fine-tuning the workers with these two variables. This is very similar to how other process managers work: Resque-pool, Unicorn, Puma, PM2, and nginx, to name a few. This concurrent multi-processing of workers is actually a benefit inside containers, as it saves memory. If we have two workers and one master process, one worker sleeps while the other processes a request. The general benefit is that both workers share memory and resources passed down from the master process: at the very least, they share the dynamic libraries they use as well as the memory they reserve to process requests.
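The pre-fork master/worker arrangement described above can be sketched in a few lines. This is a simplified illustration of the model, not gunicorn's actual code, and it assumes a POSIX system where os.fork is available:

```python
import os

def prefork(num_workers, worker_loop):
    """Minimal pre-fork master: fork N workers, then reap them all.

    Each child runs worker_loop() and exits; the master only forks
    and waits, which is the essence of the pre-fork server model.
    """
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # child process: do the work, then exit immediately
            worker_loop()
            os._exit(0)
        pids.append(pid)  # parent (master): remember the child's PID
    for pid in pids:
        os.waitpid(pid, 0)  # master reaps each worker as it exits
    return pids
```

Because every worker is forked from the same master, read-only pages (loaded libraries, imported application code) are shared between them via copy-on-write, which is where the memory saving comes from.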
Agree with you. Python has the global interpreter lock, so the threading model is not a good idea; gevent is a good solution in high-IO situations. I wanted to use Gunicorn + gevent to run the controller, but then I thought it might not be necessary, because the controller should not be a performance bottleneck.
Yes, exactly. The Python GIL is part of the interpreter and there is no way around it. The controller is still IO-bound, so CPU is not the deciding factor in how many requests can be handled without blocking. If users want to run the controller on machines with many CPUs, they should be able to scale the gunicorn workers and fine-tune them, which would normally be useful for maximizing concurrency and parallelism. There is a good article on how the different types of workers behave in gunicorn.
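For choosing the worker count, the gunicorn documentation suggests (2 x cores) + 1 as a starting point. A hedged sketch of how a GUNICORN_WORKERS override might be honoured, falling back to that heuristic; the function is hypothetical, not the controller's actual entrypoint:

```python
import multiprocessing
import os

def gunicorn_workers(environ=os.environ):
    """Pick a worker count: honour GUNICORN_WORKERS when set,
    otherwise fall back to the (2 * cores) + 1 rule of thumb the
    gunicorn docs suggest as a starting point."""
    raw = environ.get("GUNICORN_WORKERS")
    if raw:
        return int(raw)  # explicit override from the chart/env
    return multiprocessing.cpu_count() * 2 + 1
```

On a 64-core machine the heuristic would spawn 129 workers, which is exactly the scenario where being able to cap the count from values.yaml matters.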
@@ -24,8 +24,6 @@
     'security.W008'
 ]

-CONN_MAX_AGE = 60 * 3
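Note that in Django the connection lifetime that the ORM actually reads lives inside the per-database dictionary in DATABASES, not in a standalone module constant. A hedged sketch (engine and database name are assumptions for illustration):

```python
# Hypothetical sketch: where Django's per-connection lifetime
# setting normally lives in settings.py.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "deis",           # database name is an assumption
        "CONN_MAX_AGE": 60 * 3,   # recycle connections after 3 minutes
    }
}
```

If the removed constant was only feeding this key, dropping it changes nothing as long as the DATABASES entry still sets CONN_MAX_AGE from somewhere.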
Does this need to be added back? I think this is a different CONN_MAX_AGE env variable?
Hi @kingdonb, have you tested this? What about setting the
and CONN_MAX_AGE from global values.yaml?
The controller values.yaml should also mention all of the settings that can affect the controller.
a965de8 to 4f93b5d
Building this image, rebased on the latest controller, gave me a headache (CrashLoopBackOff).
I have no idea what caused this; it doesn't look like it's related to my change, but I submitted a separate PR with a chart-only change that gets it done. We already had the gunicorn_workers code merged into the controller Python code; it only needed to be set as an environment variable from the chart/deployment spec. The loose end here is the CONN_MAX_AGE thing. I'll keep this open until we figure out whether a change is needed for that.
Seems to be failing at
Do you have some type of user or environment string with a special character? We must be breaking somewhere with the connection string to the db.
teamhephy#75 (review)
> I think this is a different CONN_MAX_AGE env variable?
I don't see where else this variable is used, but no other change is present which could have caused the tests to fail?
Closed in favor of #139
chore(oauth2): update user info pipeline
permit setting GUNICORN_WORKERS and CONN_MAX_AGE from global values.yaml
(an additional pull request to the umbrella 'workflow' chart will follow)